LONG-TERM GOALS

An  over-arching  goal  in  prediction  science  is  to  objectively  improve  numerical  models  of  nature. 
Meeting  that  goal  requires  objective  quantification  of  deficiencies  in  our  models.  The  structural 
differences  between  a  numerical  model  and  a  true  system  are  difficult  to  ascertain  in  the  presence  of 
multiple  sources  of  error.  Numerical  weather  prediction  (NWP)  is  subject  to  temporally  and  spatially 
varying  error,  resulting  from  both  imperfect  atmospheric  models  and  the  chaotic  growth  of  initial- 
condition  (IC)  error.  The  aim  of  our  work  is  to  provide  new  methods  that  begin  to  systematically 
disentangle  the  model  inadequacy  signal  from  the  initial  condition  error  signal. 

OBJECTIVES 

We are engaged in a comprehensive effort that uses state-of-the-science estimation methods in data
assimilation (DA) and statistical modeling, including: (1) the characterization of existing
model-to-model differences via hierarchical spatial modeling methods; (2) the development of a flexible
representation for the various spatial and temporal scales of model error; (3) the estimation of
parameters to represent those scales using a probabilistic approach to DA, namely the Ensemble
Kalman Filter; and (4) the determination of whether incorporating the estimated error structure
improves short-term forecasts, again using hierarchical methods, this time within a formal testing
framework. The research focuses on near-surface winds over both the ocean and land. The methods under
development are sufficiently general to apply to a wide range of battlespace environments.

APPROACH 

The technical approach includes numerical weather prediction and state estimation efforts at NPS, and
statistical modeling efforts at the University of California, Berkeley (UCB) under sub-contract. At NPS,
PI Hacker and post-doc Kolczynski are implementing the Navy Operational Global Atmospheric
Prediction System (NOGAPS) and two limited-area mesoscale models, the Navy’s Coupled
Ocean-Atmosphere Mesoscale Prediction System (COAMPS) and the open-source Weather Research
and Forecasting (WRF) model, within the state-of-the-science ensemble Data Assimilation Research
Testbed (DART). NOGAPS-DART provides a global ensemble prediction capability that can be
applied consistently to COAMPS and WRF as lateral boundary conditions. Scientific objectives
will be met by systematically choosing either WRF or COAMPS as the “truth,” which then provides
observations for assimilation into the other model. Under this approach, spatio-temporal distributions
of uncertainty (error in this context) are available for analysis, with special attention to second-order
moments. Eventually, we will use the same framework to objectively estimate parameters in statistical
models of NWP model error developed at UCB. Hypotheses are being formed and formally tested.

UCB  PI  Cari  Kaufman  is  working  to  advance  the  statistical  methods  needed  to  provide  an  objective 
space-time  characterization  of  the  error  distributions.  Uncertainty  is  characterized  via  fitting  a 
hierarchical  Bayesian  model  that  captures  the  important  features  and  variability  in  the  data.  The 
implied  distribution  from  the  model  will  be  a  valid  stochastic  spatial  process  under  probability  theory. 
Ideally, fitting the statistical model to different datasets should allow us to capture the significant
differences between the underlying data-generating distributions. Moreover, a realistic
statistical model can simulate realistic wind fields quickly, which can be beneficial for studying
other processes that require surface winds as an input. Graduate student Wayne Lee (unfunded) is
contributing  substantially  to  this  work.  Postdoc  Benjamin  Shaby  began  work  in  September  2011  and 
completed  his  appointment  in  Feb.  2013. 

WORK  COMPLETED 

Continued  funding  lags  led  to  a  significant  reallocation  of  resources  toward  different  tasks  in  FY2013. 
Work  focused  on  the  most  promising  aspects  of  earlier  work,  and  leveraged  other  related  projects. 
Tasks requiring ambitious technical development work prior to scientific discovery were abandoned.
No significant work was wasted; instead, the technical developments through FY2013 were
leveraged to obtain meaningful results. The result was a focus on Tasks 4 and 5, with modified
technical details underlying each.

At NPS, analysis of the WRF-DART state estimates and predictions, driven by NOGAPS-DART
and the radiosonde network during an Oct 2009 simulation period, continued with Self-Organizing
Maps (SOMs) adapted to understand model errors. SOMs have emerged during the last decade
as an objective method for classification, but have not yet been extended to studies of predictability and
model inadequacy. Systematic increments reveal time- and space-dependent systematic errors, while
SOMs provide a method for categorizing forecasts or increment patterns. The SOMs can be used either
for direct analysis or to produce composites of other fields. This study uses the forecasts and
increments of 2-m temperature and dry column mass perturbation (μ) over a four-week period.
Results demonstrate the potential of this technique for
identifying spatially varying systematic model errors. Results from FY2013 are presented below. NPS
also began work to extend the error estimation and parameter estimation experiments with the
single-column version of the WRF model in DART to three-dimensional experiments.

At UCB, the focus was on addressing challenges associated with applying hierarchical Bayesian techniques
to large, multivariate, and non-stationary datasets typical of NWP. Methods proposed and tested
during FY12 were applied to real data in FY13. The first is incorporation of the geostrophic
relationship into a hierarchical model to relate surface winds to pressure gradients, allowing the
geostrophic coefficients to vary spatially, together with a Gaussian Markov random field (GMRF)
approximation to speed computations. Results from application to 40 years of model data confirm that
physical constraints, such as geostrophy, applied to a hierarchical model can help identify dependence
structures. Second, a hierarchical Bayesian model explicitly separates error fields occurring at different
time scales in sea-surface temperature fields, representing a more informative extension of empirical
orthogonal functions (EOFs). With future work, we expect these methods to merge with work at NPS.

RESULTS 

1.  Mesoscale  model  results 

Systematic increments from data assimilation are a linear function of model errors integrated over the
assimilation interval. The challenge is to interpret the increments in space and time to reveal the scales
of model errors. We have recently examined Self-Organizing Maps (SOMs; Kohonen, 1988) as a tool
for identifying the coherent systematic error structures. SOMs are unsupervised machine-learning
algorithms that produce low-dimensional “maps” of possible state vectors, called “nodes,” organized so
that nearby nodes are similar. The location of the nodes is specified and may be arranged in
any pattern, though rectangular and hexagonal grids are most common. The method has been used for
cluster analysis in the past, but has not been used in a data assimilation context or to understand model
inadequacy. We have applied SOMs to the ensemble-mean increment in meteorological fields
produced by DART-NOGAPS-WRF.

SOMs  are  produced  through  an  iterative  process.  During  each  iteration  a  random  state  is  chosen  from 
the  training  dataset.  The  random  state  is  compared  to  each  node  to  identify  the  node  closest  to  the 
chosen  state  based  on  a  cost  function  (often  the  root-mean-squared  error).  That  node  is  adjusted  closer 
to  the  random  state.  Nearby  nodes  are  also  adjusted  closer  to  the  same  random  state  (to  a  lesser 
degree). The magnitude of the adjustment and the size of the neighborhood are both reduced with each
iteration, so that what begin as large changes to a large portion of the map become small changes to
individual nodes.
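
The iteration described above can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the configuration used in this study: the function names, the Gaussian neighborhood weight, the linear decay of the learning rate and radius, and the synthetic two-cluster data are all hypothetical choices made for the example.

```python
import math
import random

def rmse(a, b):
    """Root-mean-squared difference between two state vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a))

def best_matching_unit(nodes, x):
    """Grid position of the node closest to state x (cost function: RMSE)."""
    return min(nodes, key=lambda rc: rmse(nodes[rc], x))

def train_som(data, rows, cols, n_iter=2000, lr0=0.5, radius0=None, seed=0):
    """Train a rectangular SOM; returns {(row, col): node vector}."""
    rng = random.Random(seed)
    if radius0 is None:
        radius0 = max(rows, cols) / 2.0
    # Initialize every node from a randomly drawn training state.
    nodes = {(r, c): list(rng.choice(data))
             for r in range(rows) for c in range(cols)}
    for it in range(n_iter):
        frac = it / n_iter
        lr = lr0 * (1.0 - frac)                    # adjustment magnitude shrinks
        radius = max(radius0 * (1.0 - frac), 0.5)  # neighborhood shrinks
        x = rng.choice(data)                       # random training state
        bmu = best_matching_unit(nodes, x)
        for (r, c), w in nodes.items():
            d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            # Assumed Gaussian neighborhood: weight 1 at the best-matching
            # node, smaller for nodes farther away on the map.
            h = math.exp(-d2 / (2.0 * radius ** 2))
            for k in range(len(w)):
                w[k] += lr * h * (x[k] - w[k])     # move node toward the state
    return nodes

# Synthetic training set: two clusters of 2-element "states".
data_rng = random.Random(1)
data = [[data_rng.gauss(m, 0.3), data_rng.gauss(-m, 0.3)]
        for m in (-2.0, 2.0) for _ in range(50)]
som = train_som(data, rows=2, cols=3)
```

Because each update moves a node only part of the way toward a training state, every node stays within the range spanned by the training data.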

This  particular  study  explored  a  technique  for  identifying  model  error  using  self-organizing  maps 
(SOMs)  and  analysis  increments.  Because  increments  quantify  the  changes  in  the  model  state  made  by 
the  assimilation  system  to  more  closely  represent  observations,  they  are  a  measure  of  the  forecast  error. 
The SOM was used to categorize forecasts or increments, which can then be analyzed directly or used
to  make  composites  of  other  fields.  SOMs  objectively  condition  or  classify  the  data  set  based  on  flow 
characteristics  contained  within  a  subset  of  the  model  state. 
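
The compositing step can be illustrated with a short sketch: each forecast is assigned to its closest SOM node, and the increments of all forecasts assigned to the same node are averaged. The function names and the toy numbers below are hypothetical and are not taken from the study.

```python
import math
from collections import defaultdict

def nearest_node(nodes, state):
    """Return the key of the SOM node with the smallest RMSE to `state`."""
    def rmse(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
    return min(nodes, key=lambda k: rmse(nodes[k], state))

def composite_by_node(nodes, forecasts, increments):
    """Average the increments of all forecasts classified into each node."""
    sums, counts = {}, defaultdict(int)
    for fcst, inc in zip(forecasts, increments):
        key = nearest_node(nodes, fcst)
        if key not in sums:
            sums[key] = [0.0] * len(inc)
        sums[key] = [s + v for s, v in zip(sums[key], inc)]
        counts[key] += 1
    composites = {k: [s / counts[k] for s in sums[k]] for k in sums}
    return composites, dict(counts)

# Hypothetical toy example: two nodes, three forecasts, two-point increments.
nodes = {(0, 0): [0.0, 0.0], (0, 1): [10.0, 10.0]}
forecasts = [[1.0, -1.0], [9.0, 11.0], [0.5, 0.5]]
increments = [[-2.0, 0.0], [1.0, 1.0], [-4.0, 2.0]]
composites, counts = composite_by_node(nodes, forecasts, increments)
```

The returned counts play the same role as the frequency counts shown above each composite panel: they record how many forecasts contributed to each node's average.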

Here we focused primarily on two forecast variables: 2-m temperature and dry column mass
perturbation (μ) over a roughly four-week period. SOMs produced using the forecasts of each exhibit
readily identifiable patterns. The temperature forecast SOM separated primarily by time of day,
presumably driven by diurnal heating. In contrast, the synoptically dominated μ forecast SOM separated into
multi-day periods reflecting the synoptic pattern.

Composites of analysis increments for each SOM demonstrate possible model and/or data assimilation
deficiencies. Composite increments of μ for the temperature forecast SOM are generally negative, and
are more negative for the colder SOM nodes. Composite increments of μ for the μ forecast SOM
indicate a potential problem in the model system with ridge and trough formation over central North
America. An example is given in Fig. 1.


[Figure 1 panels: (a) Composite of μ increments on μ forecast SOM; (b) Composite of 2-m temperature increments on μ forecast SOM]

Fig. 1: Composites of (a) dry air mass (μ) increments and (b) 2-m temperature increments for the
dry air mass (μ) forecast SOM. Frequency counts above each panel refer to the number of forecasts
composited in each panel. The feature that stands out most in the μ increment composite for the μ
forecast SOM (Fig. 1a) is the large negative increment over central North America in the composite
for node (0,2). The μ forecast SOM for that node exhibits ridging over central North America, and
the composite increment is strongly negative. The combination suggests that the model
systematically over-amplifies the ridge in the lee of the Rockies.

This research represents only a preliminary examination of this method and there is much room for
extension and augmentation. One possibility is to include additional information related to the
ensemble distribution in the SOM. For instance, SOMs could model the mean and variance of an
ensemble, adding forecast uncertainty to the analysis. Another possibility is the use of multi-variable
SOMs, combining multiple variables into a single categorization of forecasts or increments. There are
also the obvious extensions of the presented technique such as a longer analysis period, additional
variables, and different SOM node configurations.

In a separate study leveraging other ongoing work, we extended the parameter estimation work reported in
Hacker and Angevine (2013) from a single column to 3D. Experiments with the single-column
implementation of the WRF model provide a basis for deducing land-atmosphere coupling errors in the
model. Coupling occurs both through heat and moisture fluxes through the land-atmosphere interface
and roughness sub-layer, and through turbulent heat, moisture, and momentum fluxes through the atmospheric
surface layer. Thermal fluxes are directly affected by the aerodynamic roughness temperature, which is
neither observable nor available from energy balance solutions at the surface of the earth. Often, a
constant parameter determines the thermal roughness as a constant multiple of the momentum
roughness length. Hacker and Angevine (2013) showed that state augmentation in ensemble data
assimilation with a single-column model both provides time-varying estimates of the parameter value
and reduces systematic model error.
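
To make the idea of state augmentation concrete, the sketch below appends an uncertain parameter to the state of a toy scalar model and updates both with a perturbed-observation ensemble Kalman filter. Everything in it is an assumption made for illustration: the toy model, the parameter name `theta`, the ensemble size, and the filter variant. It is not the WRF-DART configuration of Hacker and Angevine (2013), and DART's own filter algorithms differ from this one.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance of two equal-length ensembles."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def model_step(x, theta):
    # Hypothetical toy model: damped state forced by the uncertain parameter.
    return 0.5 * x + theta

def enkf_augmented(obs, n_ens=50, obs_var=0.01, seed=0):
    """Return the ensemble-mean parameter estimate after each analysis."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_ens)]      # state ensemble
    thetas = [rng.gauss(1.0, 1.0) for _ in range(n_ens)]  # parameter ensemble
    history = []
    for y in obs:
        # Forecast: each member uses its own parameter; the parameter persists.
        xs = [model_step(x, th) for x, th in zip(xs, thetas)]
        # Analysis: only x is observed, so the parameter is updated through
        # its ensemble covariance with the forecast state.
        denom = cov(xs, xs) + obs_var
        gain_x = cov(xs, xs) / denom
        gain_th = cov(thetas, xs) / denom
        innov = [y + rng.gauss(0.0, obs_var ** 0.5) - x for x in xs]
        xs = [x + gain_x * d for x, d in zip(xs, innov)]
        thetas = [th + gain_th * d for th, d in zip(thetas, innov)]
        history.append(mean(thetas))
    return history

# Synthetic "truth" run and noisy observations of the state only.
TRUE_THETA = 2.0
obs_rng = random.Random(42)
truth, obs = 0.0, []
for _ in range(40):
    truth = model_step(truth, TRUE_THETA)
    obs.append(truth + obs_rng.gauss(0.0, 0.1))
theta_history = enkf_augmented(obs)
```

The list `theta_history` holds one parameter estimate per assimilation cycle, which is the sense in which an augmented-state filter yields a time-varying estimate of a parameter that is never observed directly.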

The extension to 3D allows a spatial characterization of the parametric error controlling the turbulent
fluxes. Results are forthcoming, but sensitivity tests performed by changing the domain-constant value
of the parameter from 0.01 to 1.0 (a plausible range based on a literature review) show significant
spatial variability. Given that forecast biases are often state dependent, we might expect significant
variability in the parameter estimates.

2.  Modeling  Uncertainty  in  Surface  Wind  Fields 

We  continued  our  work  to  construct  a  probabilistic  model  to  characterize  the  dependence  structures  in 
surface  wind  fields.  In  previous  reports,  we  described  our  incorporation  of  the  geostrophic  relationship 
into  a  hierarchical  model  to  relate  surface  winds  to  pressure  gradients,  allowing  the  geostrophic 
coefficients  to  vary  spatially,  and  the  use  of  a  Gaussian  Markov  random  field  (GMRF)  approximation 
to speed computations. Our work this year has been to demonstrate and evaluate our model using the
Japanese model MIROC3.2 at medium resolution under the pre-industrial experiment scenario. In this
analysis, we work with average wind fields over each season for each year. This yields 40 years of
average surface wind fields for winter and summer.

To  illustrate  these  results,  Figs.  2  and  3  show  the  posterior  means  for  the  spatially  varying  coefficients 
for  the  winter  wind  fields.  One  quick  sanity  check  is  the  agreement  with  the  geostrophic  relationship. 
Given  that  the  Coriolis  force  switches  sign  at  the  equator,  the  sign  change  for  the  geostrophic 
coefficient at the equator is promising. Another promising aspect is that the geostrophic
coefficients are larger in magnitude over the ocean than over land, which is consistent with our
understanding of the effect of friction on wind velocity. The coefficient process also shows discontinuous
behavior where the surface changes between land and sea, which is a feature we wanted to capture in this
model. Another interesting result is that the ageostrophic pressure gradient is consistently negative if not
zero.
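
As a rough illustration of the sign check described above, the textbook geostrophic relationship gives u_g = -(1/(ρf)) ∂p/∂y and v_g = (1/(ρf)) ∂p/∂x, where f = 2Ω sin(latitude) is the Coriolis parameter. The sketch below evaluates it for the same pressure gradient in both hemispheres. The constant air density and the example gradient are assumed values, and the code is not the hierarchical model itself; it only shows why a physically consistent coefficient should change sign across the equator.

```python
import math

OMEGA = 7.2921e-5  # Earth's rotation rate (rad/s)
RHO = 1.2          # assumed near-surface air density (kg/m^3)

def coriolis(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude), in 1/s."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))

def geostrophic_wind(dpdx, dpdy, lat_deg):
    """Return (u_g, v_g) in m/s from pressure gradients in Pa/m.

    Undefined at the equator, where f = 0.
    """
    f = coriolis(lat_deg)
    return -dpdy / (RHO * f), dpdx / (RHO * f)

# Same gradient (pressure rising northward by 1 hPa per 100 km) at 45N and 45S.
u_north, _ = geostrophic_wind(0.0, 1e-3, 45.0)
u_south, _ = geostrophic_wind(0.0, 1e-3, -45.0)
```

With these assumed values the zonal wind is about 8 m/s in magnitude, easterly at 45N and westerly at 45S, so the coefficient multiplying the pressure gradient has opposite signs in the two hemispheres.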

Besides agreement with the geostrophic relationship, prediction accuracy is also useful for evaluating
the model. Figure 4 shows the predicted wind velocities vs. the actual wind velocities in 10 years of data
set aside for testing. Relative to the overall variability in the wind components, the prediction error is
quite small.


[Figure 2 panels: Winter U: p0; Winter U: p1; Winter U: p2]

Fig. 2: Posterior means of model coefficients for U component of wind for winter data.

[Figure 3 panels: Winter V: p0; Winter V: p1; Winter V: p2]

Fig. 3: Posterior means of model coefficients for V component of wind for winter data.

[Figure 4 panels: Velocity for U; Velocity for V; horizontal axes: True Velocity]

Fig. 4: Surface prediction vs. true wind velocity (m/s) for 10 years of data.

3.  Separating  spatial  scales  with  model-based  EOFs 

We created a new hierarchical Bayesian model to explicitly separate error fields that occur at different
time scales. The model extends traditional EOFs to incorporate spatial and temporal
dependence. As described in our last report, the main building block for our models is the observation
of Tipping and Bishop (1999) that EOF construction is equivalent to maximizing the likelihood of a
particular probability model given the data. EOF computations decompose errors into p spatial fields,
which we denote as m_1, ..., m_p, each scaled by one of p time series, which we denote as z_1, ..., z_p. The standard
probability model corresponding to EOFs assumes that each spatial field m_i is independent across
locations and that each time series z_i is independent through time. Our Bayesian method works by
assigning a prior distribution to each m_i that encourages nearby locations to behave similarly, and a
prior distribution to each z_i that encourages temporal structures with characteristic frequencies.
These prior distributions are modeled as Gaussian processes with prescribed covariance structures.
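
For reference, the equivalence noted by Tipping and Bishop (1999) can be written compactly. The notation below is an assumption introduced for this sketch rather than taken from the report: x_t is the d-dimensional error field at time t, \bar{x} is its time mean, M collects the spatial fields as columns, U_p and \Lambda_p hold the p leading eigenvectors (the EOFs) and eigenvalues \lambda_j of the sample covariance, and R is an arbitrary p-by-p orthogonal matrix.

```latex
\begin{align}
  x_t &= \sum_{i=1}^{p} m_i \, z_i(t) + \bar{x} + \epsilon_t,
  \qquad z_i(t) \sim N(0, 1), \quad \epsilon_t \sim N(0, \sigma^2 I_d), \\
  M_{\mathrm{ML}} &= U_p \left( \Lambda_p - \sigma^2 I_p \right)^{1/2} R,
  \qquad M = [\, m_1, \ldots, m_p \,], \\
  \sigma^2_{\mathrm{ML}} &= \frac{1}{d - p} \sum_{j = p + 1}^{d} \lambda_j .
\end{align}
```

In this standard model the maximum-likelihood spatial fields span the same subspace as the leading EOFs, determined only up to the rotation R. Replacing the independent priors on m_i and z_i(t) with structured Gaussian-process priors is the step that distinguishes the hierarchical model described above.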

Previously  we  applied  this  model  to  simulated  data.  We  have  now  applied  it  to  sea  surface  temperature 
data  and  compared  the  results  to  a  traditional  EOF  analysis.  As  shown  in  Figs.  5  and  6,  the  results  are 
similar, but the hierarchical model clearly separates two very different scales of temporal variability. Another
main  advantage  of  the  hierarchical  model  is  that  locations  that  have  some  missing  data  can  still  be 
included,  whereas  these  must  be  thrown  out  in  traditional  EOF  analysis.  This  can  be  seen  for  some 
Antarctic locations in Fig. 5.

[Figure 5 panels: m_1, m_2]

Fig. 5: Posterior mean fields and EOFs. Note that the ordering of the posterior mean fields, unlike
EOFs, is arbitrary.

Fig. 6: Time series from hierarchical model (posterior means) and EOF analysis.

IMPACT/APPLICATIONS

The  bulk  of  DoN  day-to-day  operations  rely  on  accurate  predictions  of  winds,  seas,  ceiling,  and 
visibility.  The  focus  of  the  proposed  work  is  to  identify  inadequacies  associated  with  the  modeled 
atmospheric  boundary  layer.  Any  discoveries  that  enable  the  improvement  of  boundary  layer 
modeling  will  ultimately  have  a  positive  impact  on  Navy  warfighters. 

The proposed methods have the potential to enable essential improvement in modeling capability.
Instead of tuning models based on intuition, we are forming a foundation for objective identification of
model errors. Those errors could immediately be accounted for in probabilistic forecast systems, and
also be subject to physical interpretation by subject experts.

RELATED PROJECTS


The MATERHORN project (http://www.nd.edu/~dynamics/materhorn/index.html), funded by ONR,
seeks  to  improve  atmospheric  predictability  over  complex  terrain.  It  is  similarly  focused  on 
predictions  in  the  atmospheric  boundary  layer.  Rather  than  a  focus  on  model  inadequacy, 
MATERHORN  focuses  on  field  programs  aimed  at  improving  models  via  direct  comparison  to 
observations,  and  quantifying  optimal  observing  strategies  for  improving  predictions.  PI  Hacker  is 
using  some  of  the  technical  developments  here  to  aid  that  effort,  and  vice  versa. 

REFERENCES 

Hacker,  J.  P.,  W.  M.  Angevine,  2013:  Ensemble  Data  Assimilation  to  Characterize  Surface-Layer 
Errors  in  Numerical  Weather  Prediction  Models.  Mon.  Wea.  Rev.,  141,  1804-1821. 

Kohonen, T., 1988: Self-organization and associative memory. Springer-Verlag, New York, 312 pp.

Tipping,  M.E.  and  C.M.  Bishop,  1999:  Probabilistic  principal  component  analysis.  J.  Roy.  Stat.  Soc., 
Series  B:  Methodology,  61,  611-622. 

PUBLICATIONS 

Hacker,  J.  and  W.  Angevine,  2013:  Ensemble  data  assimilation  to  characterize  land-atmosphere 
coupling  errors  in  numerical  weather  prediction  models.  Mon.  Wea.  Rev.,  141,  1804-1821. 

Kolczynski,  W.  and  J.  Hacker,  2013:  The  potential  for  Self-Organizing  Maps  to  identify  model  error 
structures.  Mon.  Wea.  Rev.,  accepted  pending  revisions. 




